Filtering spam from bad neighborhoods

Authors

  • Ward van Wanrooij
  • Aiko Pras
Abstract

One of the most annoying problems on the Internet is spam. To fight spam, many approaches have been proposed over the years. Most of these approaches involve scanning the entire contents of e-mail messages in an attempt to detect suspicious keywords and patterns. Although such approaches are relatively effective, they also have some disadvantages. An interesting question, therefore, is whether spam can be detected effectively without analyzing the entire contents of e-mail messages. The contribution of this paper is an alternative spam detection approach that relies solely on analyzing the origin (IP address) of e-mail messages, as well as any links to websites (URIs) within the messages. Compared to analyzing suspicious keywords and patterns, detection and analysis of URIs is relatively simple. The IP addresses and URIs are compared to various kinds of blacklists; a hit increases the probability of the message being spam. Although the idea of using blacklists is well known, the novel idea proposed in this paper is the concept of ‘bad neighborhoods’. To validate our approach, a prototype has been developed and tested on our university’s mail server. The outcome was compared to SpamAssassin and mail server log files. The comparison showed that our prototype has remarkably good detection capabilities (comparable to SpamAssassin) while putting only a small load on the mail server. Copyright © 2010 John Wiley & Sons, Ltd.
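The blacklist lookup described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' prototype. The abstract does not define a 'bad neighborhood', so the sketch assumes it means an address block (here the enclosing /24) that contains several blacklisted hosts. The blacklist entries, the threshold, the score weights, and all function names are hypothetical, and an additive score stands in for the "probability" mentioned in the abstract.

```python
import ipaddress
import re
from collections import Counter

# Hypothetical blacklist data (documentation address ranges, not real listings).
BLACKLISTED_IPS = {"192.0.2.10", "192.0.2.11", "192.0.2.57", "203.0.113.5"}
BLACKLISTED_HOSTS = {"spam.example"}

NEIGHBORHOOD_PREFIX = 24   # assumption: a neighborhood is the enclosing /24
BAD_NEIGHBORHOOD_MIN = 3   # assumption: listed hosts needed to call a block bad

# Captures the host part of http(s) URIs found in a message body.
URI_RE = re.compile(r"https?://([A-Za-z0-9.-]+)", re.IGNORECASE)


def neighborhood(ip):
    """Return the network block that contains the given address."""
    return ipaddress.ip_network(f"{ip}/{NEIGHBORHOOD_PREFIX}", strict=False)


def bad_neighborhoods(blacklisted_ips):
    """Return the blocks holding at least BAD_NEIGHBORHOOD_MIN listed hosts."""
    counts = Counter(neighborhood(ip) for ip in blacklisted_ips)
    return {net for net, n in counts.items() if n >= BAD_NEIGHBORHOOD_MIN}


def extract_hosts(body):
    """Return the lower-cased host names of all URIs in the body."""
    return {host.lower().rstrip(".") for host in URI_RE.findall(body)}


BAD = bad_neighborhoods(BLACKLISTED_IPS)


def spam_score(origin_ip, body):
    """Add up illustrative weights for each blacklist hit."""
    score = 0.0
    if origin_ip in BLACKLISTED_IPS:
        score += 2.0   # the sender itself is listed
    elif neighborhood(origin_ip) in BAD:
        score += 1.0   # the sender is unlisted but sits in a bad block
    for host in extract_hosts(body):
        if host in BLACKLISTED_HOSTS:
            score += 2.0   # the message links to a listed host
    return score
```

In this sketch only the sender address and the URI hosts are inspected, so the rest of the message body is never scanned for keywords. A real deployment would presumably query external blacklists instead of in-memory sets.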


Similar resources

Spam Mail Filtering through Dynamically Updating URL Statistics

This paper presents a unique spam mail filtering technique based on a deep analysis of statistics on URLs included in various e-mails gathered from a laboratory in a university for about six months. Since the proposed mail filtering technique searches only URLs in mail, the overhead introduced by searching all mail contents or blacklist utilized by many other mail filtering algorithms is sig...


The Concept of Embedded Values and the Example of Internet Security

Many current technological devices used in our everyday lives confront us with a host of new ethical issues to be addressed. Facebook, Twitter, or smart phones are all examples of technologies used quite pervasively which call into question culturally significant values like privacy, among others. The embedded values concept presents the compelling idea that engineers, scientists and designers ...


A Multifaceted Approach to Spam Reduction

As we think about the history of spam reduction, we can see a gradual change in the approach over time, as the spam problem has changed. Many of us may think of spam as a new problem, but in fact, it goes back at least to 1975, as noted by the late Jon Postel.[1] At the start users were mostly “techies”, and spam mostly referred to Usenet newsgroup posts that got out of hand, wherein someone wo...


SMS spam filtering: Methods and data

Mobile or SMS spam is a real and growing problem primarily due to the availability of very cheap bulk pre-pay SMS packages and the fact that SMS engenders higher response rates as it is a trusted and personal service. SMS spam filtering is a relatively new task which inherits many issues and solutions from email spam filtering. However it poses its own specific challenges. This paper motivates ...


Single-Class Learning for Spam Filtering: An Ensemble Approach

Spam, also known as Unsolicited Commercial Email (UCE), has been an increasingly annoying problem to individuals and organizations. Most prior research has formulated spam filtering as a classical text categorization task, in which training examples must include both spam emails (positive examples) and legitimate mails (negatives). However, in many spam filtering scenarios, obtaining legitimate ...



Journal title:
  • Int. Journal of Network Management

Volume 20  Issue

Pages  -

Publication date 2010